Materialized View Selection for a Data Warehouse Using Frequent Itemset Mining

Authors

  • Mohammad Karim Sohrabi
  • Vahid Ghods
Abstract

A data warehouse is a subject-oriented, consolidated, integrated, and time-variant repository of possibly heterogeneous data. It is used to answer on-line analytical queries over millions of records in an acceptable time. Because a data warehouse often holds millions of records, reducing the time of on-line analytical processing is an important challenge. One of the most important techniques that address this problem is view materialization. Each sub-query produces an intermediate table, called a virtual view, which is used to compute the final result of the analytical query. These virtual views are often shared by several analytical queries. We can materialize such views to avoid redundant computation and thus reduce query response time. The storage constraint on the one hand, and the maintenance cost of materialized views when the source data are updated on the other, make it impossible to materialize all, or even a large part, of the views. Therefore, selecting a proper set of views to materialize plays a major role in performance. Many view selection methods exist, using different techniques and frameworks to select an optimal set of views to materialize. In this paper, we present a new efficient method for selecting a proper set of views to materialize using a frequent itemset mining approach. In our algorithm, the set of given queries is transformed into a transaction database in which each transaction corresponds to a query and the items of a transaction are the original query's predicates. Our performance study showed that this algorithm substantially outperformed the best previous algorithms.
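The query-to-transaction mapping described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the mining procedure, so the sketch below uses plain brute-force subset counting, which is only practical for queries with a handful of predicates. The function names, the example queries, the predicate strings, and the `min_support` threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def queries_to_transactions(queries):
    """Turn each query into a transaction: the set of its predicates.

    `queries` is a hypothetical dict mapping a query name to a list of
    predicate strings; the paper's actual input format is not given.
    """
    return [frozenset(preds) for preds in queries.values()]


def frequent_itemsets(transactions, min_support):
    """Return every predicate set contained in at least `min_support` transactions.

    Brute-force counting (exponential in transaction size), used here
    only for illustration in place of a real mining algorithm.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(transaction)
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Hypothetical workload of three analytical queries.
queries = {
    "q1": ["sales.year = 2015", "sales.region = 'EU'", "product.category = 'books'"],
    "q2": ["sales.year = 2015", "sales.region = 'EU'"],
    "q3": ["sales.year = 2015", "product.category = 'books'"],
}

transactions = queries_to_transactions(queries)
frequent = frequent_itemsets(transactions, min_support=2)

# Predicate sets shared by several queries are candidate views to materialize.
candidates = [itemset for itemset in frequent if len(itemset) >= 2]
for itemset in candidates:
    print(sorted(itemset), "shared by", frequent[itemset], "queries")
```

In this sketch, a predicate combination that recurs across queries stands in for a shared intermediate result; whether it is worth materializing would still depend on the storage and maintenance costs the abstract mentions, which the sketch ignores.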


Similar resources

Improving the View Selection Algorithm in a Data Warehouse Using Frequent Query Discovery

A data warehouse is a source for storing historical data to support decision making. Analytic queries usually take a long time. To solve the response time problem, some views should be materialized so that all queries are answered in minimum response time. There are many solutions to the view selection problem. The most appropriate solution for view selection is materializing frequent queries. Previously posed ...


A Study on Answering a Data Mining Query Using a Materialized View

One of the classic data mining problems is the discovery of frequent itemsets. This problem particularly attracts the database community, as it resembles traditional database querying. In this paper we consider a data mining system that supports storing previous query results in the form of materialized data mining views. While numerous works have shown that reusing results of previous frequent item...


A Solution to View Management to Build a Data Warehouse

Several techniques exist to select and materialize a proper set of data in a suitable structure that manages the queries submitted to online analytical processing systems. These techniques are called view management techniques, and they span three research areas: 1) selecting views to materialize, 2) query processing and rewriting using the materialized views, and 3) maintaining materializ...


Data Access Paths for Frequent Itemsets Discovery

Many frequent itemset discovery algorithms have been proposed in the area of data mining research. The algorithms exhibit significant computational complexity, resulting in long processing times. Their performance is also dependent on source data characteristics. We argue that users should not be responsible for choosing the most efficient algorithm to solve a particular data mining problem. In...


An Association Rule Mining for Materialized View Selection and View Maintanance

A data warehouse (DW) is a repository with a query interface in support of decision support systems. A DW is required to answer many complex, managerial-level, and analytical queries, which calls for advanced computing techniques. The DW process involves data modeling, the ETL process, a query interface, and a reporting system. Materialized views (MV) are the pre-calculated views which ...



Journal:
  • JCP

Volume 11

Publication date: 2016